Abstract 

The stabilities of standardized (β) and structure (r_s) coefficients in canonical (CA) and 
discriminant analyses (DA) were studied. Four different situations were studied: two 
pertaining to CA and two to DA. The situations were meant to represent "somewhat typical" and 
yet varying research conditions that informed users of CA and DA often would not consider 
notably objectionable. 

Data were sampled from a real population. For each of three situations, 100 random 
samplings of size 100 each were performed. For the remaining situation, 100 random samplings 
of size 150 each were performed. In each sampling β and r_s were computed, and were 
subsequently treated so that their stabilities could be evaluated. 

Relative to the situations studied, conflicting results occurred concerning the stabilities of 
the β and r_s values. Conflicting results also occurred concerning the difference in stability 
between Roots 1 and 2 for both statistics. Furthermore, and more alarming, the stabilities of both 
statistics under the "reasonable" conditions studied were low. The results were discussed. 

Further Inquiry into the Stabilities of Standardized and Structure 
Coefficients in Canonical and Discriminant Analyses 
Perspective and Point of View 

Much evidence attests to the current interest in canonical and discriminant analyses and to 
their relevance to data analysis in research in a variety of areas. 

A sampling of the evidence is as follows: 

1. Numerous articles and books have been published, and papers presented, during the past 
several years relative to these multivariate methods. The scholarly works regarding the methods 
may be found in social sciences, education, medical sciences, business, physical sciences, 
engineering, and other areas. Several works of classical significance also exist. 

2. Papers produced by the following writers that are especially relevant to this paper: 
Heidgerken (1999); Roberts (1999); Strand (1999); Humphries-Wadsworth (1998); Thompson 
(1984, 1991, 1993, 1995a, 1995b, 1998); Kier (1997); Pedhazur (1997); Whitaker (1997); Gray, 
Baek, Woodward, Miller, and Fisk (1996); Jonathan, McCarthy, and Roberts (1996); Thomas 
and Zumbo (1996); Van de Geer (1996); Bewley and Yang (1995); Fan and Wang (1995); Fok 
and Fok (1995); Liang, Krus, and Webb (1995); Millns, Woodward, and Bolton-Smith (1995); 
O'Gorman and Woolson (1995); Seo, Kanda, and Fujikoshi (1995); Tritchler (1995); Watts 
(1995); Yokoyama (1995); Beasley and Sheehan (1994); Cole, Maxwell, Arvey, and Salas 
(1994); Crossman (1994); Huberty (1975, 1994); Kingman and Zion (1994); Sadek and Huberty 
(1994); Hutchinson (1993); Kaplan and Wenger (1993); Kirisci and Hsu (1993); Mueller and 
Cozad (1993); Campbell and Tucker (1992); Chant and Dalgleish (1992); Harris (1989, 1992); 
Huberty and Wisenbaker (1992); Romanazzi (1992); Strand, Cahill, and Dirks (1992); Taylor 
(1992); Thomas (1992); Friedrich (1991); Joachimsthaler and Stam (1990); Thorndike and Weiss 
(1973); and Barcikowski and Stevens (1975). 

Multivariate analysis of variance (MANOVA), discriminant analysis (DA), and canonical 
analysis (CA) are related multivariate statistical techniques. MANOVA generally pertains to 
the relationship between one or more categorical independent (X) variables and multiple 
continuous dependent (Y) variables. 

DA generally pertains to the relationship between a single categorical Y variable and 
multiple continuous X variables. However, many users are comfortable with the use of 
categorical X variables as long as they are coded appropriately. Furthermore, one may look 
upon a discriminant analysis as a "flip side" of a one-way MANOVA: the single categorical X 
variable in a one-way MANOVA may be the single categorical Y variable in the corresponding 
DA, and the multiple Y variables in the one-way MANOVA may be the multiple X variables in 
the corresponding DA. 

CA was initially developed in part to determine the relationship between a set of multiple 
continuous X variables and a set of multiple continuous Y variables. As practitioners have 
become more knowledgeable about CA, some have become comfortable with the use of categorical 
variables as long as they are coded appropriately. Furthermore, other statistical techniques 
such as ANOVA, some MANOVAs and DAs, multiple regression analysis, and correlation may be 
looked upon as special cases of CA. 

Wilks' lambda (Λ) pertains to the relationship between the X and Y variable composites, 
taking into account all the solutions or roots. 
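
As a small worked illustration of how Wilks' lambda takes all roots into account, the following Python sketch applies the standard identity that expresses lambda in CA as the product of (1 - R_c^2) across roots. The function name and the example canonical correlations are hypothetical and are not values from this study.

```python
import math


def wilks_lambda(canonical_correlations):
    """Wilks' lambda as the product of (1 - Rc^2) over all roots."""
    return math.prod(1.0 - rc ** 2 for rc in canonical_correlations)


# Hypothetical canonical correlations for a two-root solution.
print(round(wilks_lambda([0.5, 0.2]), 4))  # prints 0.72
```

Because every root contributes a factor, a small lambda indicates a strong overall relationship between the two variable composites, while a value near 1.00 indicates a weak one.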

Standardized coefficients, β, pertain to the relationship between a variable in one set and 
a variable in, or variate (variable composite) of, the other set, controlling for the other 
variables in its own set. Structure coefficients, r_s, pertain to the relationship between a 
variable and the variate of its own set. These statistics are produced for each solution in a 
canonical or discriminant analysis. Furthermore, these statistics are frequently utilized in 
practice, and reports relative to their stability (the degree to which their values may be 
cross validated across samplings) exist. However, the information that exists reflects 
conflicting points of view among notable statisticians: for example, Tardif and Hardy (1995), 
Thompson (1984, 1991), Huberty (1975), and Barcikowski and Stevens (1975). While Thompson 
supported the point of view that structure coefficients are not necessarily more stable than 
standardized coefficients (Thompson referred to the latter as function coefficients), most 
theorists have argued and provided evidence that r_s values are indeed more stable than β 
values. Furthermore, the difference in the stability of these statistics across different 
solutions has received little attention. 
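
To make the distinction between the two statistics concrete, the following is a minimal sketch of one common way to obtain standardized and structure coefficients for the X set in CA, using a singular value decomposition of the whitened cross-correlation matrix. It is an illustration only, assuming NumPy is available; the function names and the simulated data are hypothetical, and the analyses in this paper were run in SPSS rather than with this code.

```python
import numpy as np


def inv_sqrt(m):
    """Inverse symmetric square root of a positive-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def canonical_coefficients(x, y):
    """Return canonical correlations plus standardized and structure
    coefficients for the X set (columns index the roots)."""
    p = x.shape[1]
    r = np.corrcoef(np.hstack([x, y]), rowvar=False)
    rxx, rxy, ryy = r[:p, :p], r[:p, p:], r[p:, p:]
    # Singular values of the whitened cross-correlations are the Rc values.
    u, rc, _ = np.linalg.svd(inv_sqrt(rxx) @ rxy @ inv_sqrt(ryy))
    k = min(p, y.shape[1])
    beta = (inv_sqrt(rxx) @ u)[:, :k]  # standardized coefficients
    r_s = rxx @ beta                   # structure coefficients
    # The sign of each column is arbitrary; no sign convention is imposed.
    return rc[:k], beta, r_s


# Hypothetical data: 200 cases, 3 X variables, 2 Y variables.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = x[:, :2] + rng.normal(size=(200, 2))
rc, beta, r_s = canonical_coefficients(x, y)
print(np.round(rc, 2))
```

In this sketch each structure coefficient is the Pearson r between an X variable and the X variate formed with the standardized coefficients, which is why r_s is bounded by -1.00 and 1.00 while β is not.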

That the value of standardized coefficients is questioned in multivariate analysis is not 
surprising in that several statisticians have looked upon them in a negative manner even in the 
use of relatively simpler multiple linear regression analysis [see Darlington (1990), for example]. 

Some of the previous studies were Monte Carlo studies in which samples were 
computer generated. Real data sets were utilized in other sampling studies. 

A variety of approaches have been taken in evaluating the relative stabilities of the r_s and 
β coefficients (Strand, 1999; Fan & Wang, 1995; Tardif & Hardy, 1995; Thompson, 1991; 
Thorndike & Weiss, 1973; Barcikowski & Stevens, 1975; Huberty, 1975). 

The first writer's inquiry began several years ago — resulting in related papers being 
presented at the annual meeting of the American Educational Research Association in 1998 and 
1999 as well as other meetings. The inquiry initially focused on the relative stabilities of 
standardized and structure coefficients, and later included comparisons with the predictably more 
stable Wilks' lambda. Additional inquiry was made into the relative stability of the two statistics 
across multiple solutions. 

These pursuits led to observations of the consistent and alarming lack of stability of both 
statistics even when the conditions that are required for unambiguous interpretation of the 
statistics appeared to be minimally violated. In the last set of studies, Strand (1999) selected 
variables for study (a) whose absolute skewness values generally fell below 1.00, (b) that 
resulted in relatively low collinearity, and (c) that were at least moderately related to variables in 
the other variable set. However, one of the 18 variables that were studied had a skewness value 
of 2.53 and kurtosis value of 18.53. Another variable had respective skewness and kurtosis 
values of -1.57 and 16.17. Relative to collinearity, three of the within-set r values exceeded 
.60, the greater of the two being .73. Furthermore, for the DA conditions studied, the 
relationship between the grouping variable and the linear combination of the discriminating 
variables was not high: the respective population Λ values were .91 and .98. The results of 
these delimited studies added somewhat to the argument that structure coefficients are more 
stable than the corresponding standardized coefficients as well as providing consistent and 
alarming evidence of the relative instability of both statistics. 

In this current and again descriptive study the writers have attempted to select even more 
appropriate variables for study — variables whose skewness values and collinearity are even 
lower than in the previous studies, and variables that are even more related to variables in the 
other set of variables. 

Hypotheses 

The hypotheses relative to this paper are as follows: 

1. Relative to each of CA and DA, the stability of r_s is greater than that of the 
corresponding β. 

2. The stabilities of the β and r_s coefficients decrease in each subsequent solution or root. 

3. The stabilities of both statistics are suitable under the conditions of low skewness and 
low collinearity, and relatively high relationship between the two sets of variables. 

Method 

Population 

One population was utilized in this study. The population contained 1517 cases. One 
hundred samplings of size 100 (or of size 150, for the second set of DA runs only) were 
performed for each of the following (the total sample sizes of 100 and 150 were selected so 
that the usable sample for each run, after cases were eliminated because of missing data, was 
at least 50): 

1. DA runs concerning the first set of variables (NC, NS, A, HS, OPR, and CEDUC) 
relative to which Λ, β, r_s, and other values were produced. 

2. CA runs concerning the first set of variables (HA, NC, NS, L, and E) relative to which 
Λ, β, r_s, and other values were produced. 

3. DA runs concerning the second set of variables (OPR, OB, and COCAT) relative to 
which Λ, β, r_s, and other values were produced. 

4. CA runs concerning the second set of variables (OCCA, NS, M, OPR, and PA) 
relative to which Λ, β, r_s, and other values were produced. 

The first writer's experiences with previous studies showed that the results based on 
sample sizes of 50 and 100 were similar. Accordingly, sample sizes of 50 or more would likely 
produce results that are similar to what would be produced with moderately larger sample sizes. 
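
The sampling scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: cases are drawn without replacement within each sampling, missing values are coded as None, and cases with any missing value are dropped (listwise deletion). The function name, the simulated population, and the 10% missing-data rate are hypothetical; the study itself used SPSS for the sampling.

```python
import random


def draw_samples(population, n_samples=100, sample_size=100, seed=0):
    """Draw repeated random samples, then drop cases with missing values."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        cases = rng.sample(population, sample_size)   # without replacement
        usable = [c for c in cases if None not in c]  # listwise deletion
        samples.append(usable)
    return samples


# Hypothetical population: 1517 cases, 3 variables, about 10% of values missing.
gen = random.Random(1)
population = [
    tuple(None if gen.random() < 0.10 else gen.gauss(0, 1) for _ in range(3))
    for _ in range(1517)
]
samples = draw_samples(population)
usable_sizes = [len(s) for s in samples]
print(min(usable_sizes), max(usable_sizes))
```

The usable sample sizes vary from sampling to sampling, which is the reason the notes to Tables 5 through 8 report a mean N and SD rather than a single N.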

The variables were selected in order to represent "somewhat typical" research conditions 
relative to which the researcher exercised appropriate caution in selecting the variables as 
follows: 

1. An attempt was made to select all the continuous variables according to a criterion 
that their skewness values fell between -1.00 and 1.00, and all their kurtosis values fell below 
4.00. Informed users of CA and DA are sensitive to the requirements for valid statistics and 
tests, and attempt to avoid variables whose distributions are markedly skewed. Table 1 contains 
skewness and kurtosis values for all the continuous variables selected for use. Most, but not all, 
of the variables met the criteria. NC and NS surpassed the upper criterion for skewness by a 
relatively small amount. 





2. All the variables selected for the CA runs and all the discriminating variables selected 
for the DA runs were continuous variables. 

Table 1 

Skewness and Kurtosis Values for Continuous Variables 

              DA runs                           CA runs
Variable   Skewness   Kurtosis       Variable   Skewness   Kurtosis

                           Variable Set 1
NC           1.03       1.06         HA           0.16      -0.53
NS           1.47       3.51         NC           1.03       1.06
A            0.52      -0.79         NS           1.47       3.51
HS          -0.24       1.10         L            0.29      -0.79
OPR          0.44      -0.37         E           -0.17      -0.71

                           Variable Set 2
OPR          0.44      -0.37         OCCA         0.65      -1.03
OB          -0.38      -1.20         NS           1.47       3.51
                                     M           -0.65       0.96
                                     OPR          0.44      -0.37
                                     PA          -0.18      -0.09

3. The numbers of cases falling in the groups of the two grouping variables utilized for 
the DA runs were not seriously unequal. Relative to the first set of DA runs, the numbers of 
cases falling in the five respective population groups of the grouping variable CEDUC, after 
cases were eliminated because of missing data, were 45, 86, 255, 285, and 82. Relative to the 
second set of DA runs, the numbers of cases falling in the two respective population groups of 
the grouping variable COCAT, after cases were eliminated because of missing data, were 244 and 
673. 

4. The variables in each of the X and Y sets of variables relative to the CA runs and the 
discriminating variables in the DA runs were selected to be somewhat but not highly correlated 
with each other. Informed users of CA and DA are sensitive to the variety of specification errors 
that may be committed when utilizing the techniques, including within-set variables whose 
relationships are "too high." Tables 2 through 4 contain Pearson r values relative to the 
within-set variables selected for three of the four DA and CA runs. The within-set r values 
ranged from -.30 to .37. With regard to the second set of DA runs, the r value concerning the 
linear relationship between OPR and OB was .20. 

5. The variables in each of the X and Y sets of variables relative to the CA runs were 
also selected to be somewhat correlated with the variables in the other variable set (see Tables 3 
and 4). The between-set r values ranged from -.56 to .67. 
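
The distributional screening rule in item 1 above (skewness between -1.00 and 1.00, kurtosis below 4.00) can be sketched as follows. The sketch assumes the bias-adjusted sample formulas for skewness and excess kurtosis that statistical packages commonly report; the source does not state which formulas were used, and the function names are hypothetical.

```python
import statistics


def skewness(values):
    """Bias-adjusted sample skewness."""
    n = len(values)
    m, s = statistics.fmean(values), statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


def excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    m, s = statistics.fmean(values), statistics.stdev(values)
    fourth = sum(((v - m) / s) ** 4 for v in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def meets_criteria(values, max_abs_skew=1.00, max_kurtosis=4.00):
    """Screening rule: |skewness| < 1.00 and kurtosis < 4.00."""
    return (abs(skewness(values)) < max_abs_skew
            and excess_kurtosis(values) < max_kurtosis)


# Hypothetical variables: one symmetric, one with a single extreme case.
print(meets_criteria([1, 2, 3, 4, 5]))                    # prints True
print(meets_criteria([1, 1, 1, 1, 1, 1, 1, 1, 1, 100]))   # prints False
```

Under a rule of this kind, NC (skewness 1.03) and NS (skewness 1.47) in Table 1 would be flagged on skewness while still satisfying the kurtosis criterion.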

A CA run was performed utilizing the first set of variables in the population. Selected 
"statistics" (in quotes because they pertain to a population) that pertained to this run were as 
follows: 

1. Λ = .76, F(6, 1916) = 47.02, p = .00. 

Table 2 

Pearson r Values Relative to Variables Selected from Variable Set 1 for DA Runs 

                          Variable
Variable     NC      NS      A       HS      OPR

NC                   .19     .37    -.22    -.09
NS                           .12    -.22    -.16
A                                   -.23     .01
HS                                           .36

Table 3 

Pearson r Values Relative to Variables Selected from Variable Set 1 for CA Runs 

                       X set variable          Y set variable
Set     Variable      HA      NC      NS         L       E

X set   HA                    .01     .03
        NC                            .19
Y set   L             .36     .07     .07               -.25
        E            -.12    -.27    -.26

Table 4 

Pearson r Values Relative to Variables Selected from Variable Set 2 for CA Runs 

                       X set variable          Y set variable
Set     Variable      OCCA    NS      M          OPR     PA

X set   OCCA                  .18    -.21
        NS                           -.30
Y set   OPR          -.56    -.16     .15                .16
        PA           -.20    -.28     .67



2. Solution 1 (Root 1) β values: HA = 0.71; NC = 0.44; NS = 0.46; L = 0.61; and E = -0.66. 
Root 2 β values: HA = 0.71; NC = -0.40; NS = -0.54; L = 0.84; and E = 0.80. 

3. Root 1 r_s values: HA = .72; NC = .53; NS = .56; L = .77; and E = -.81. Root 2 r_s 
values: HA = .69; NC = -.49; NS = -.59; L = .64; and E = .59. 

A CA run was performed utilizing the second set of variables in the population. Selected 
statistics that pertained to this run were as follows: 

1. Λ = .38, F(6, 1836) = 188.19, p = .00. 

2. Root 1 β values: OCCA = -0.32; NS = -0.14; M = 0.84; OPR = 0.32; and PA = 0.90. 
Root 2 β values: OCCA = 0.97; NS = -0.04; M = 0.51; OPR = -0.96; and PA = 0.46. 

3. Root 1 r_s values: OCCA = -.51; NS = -.39; M = .94; OPR = .46; and PA = .95. Root 
2 r_s values: OCCA = .86; NS = -.05; M = .31; OPR = -.89; and PA = .31. 

Relative to the DA runs, the variables in the X set were selected to be somewhat related 
to the categorical grouping variable. 

A DA run was performed utilizing the first set of variables in the population. Selected 
statistics that pertained to this run were as follows: 

1. Λ = .48, χ²(20) = 547.31, p = .00. 

2. Root 1 β values: NC = -0.06; NS = -0.18; A = -0.32; HS = 0.62; and OPR = 0.61. 
Root 2 β values: NC = 0.12; NS = 0.18; A = 0.77; HS = -0.01; and OPR = 0.49. Statistics for 
Roots 3 and 4 are not provided. 

3. Root 1 r_s values: NC = -.24; NS = -.26; A = -.31; HS = .73; and OPR = .64. Root 2 r_s 
values: NC = .33; NS = .09; A = .86; HS = -.07; and OPR = .57. Statistics for Roots 3 and 4 are 
not provided. 
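
For readers who wish to see how DA statistics of the kind listed above can be obtained, the following is a minimal sketch, assuming NumPy is available. It follows a common convention in which standardized coefficients are raw coefficients multiplied by pooled within-groups SDs and structure coefficients are pooled within-groups correlations between each variable and the discriminant scores. Whether this matches every detail of the SPSS output used in the study is an assumption; the function name and simulated data are hypothetical.

```python
import numpy as np


def discriminant_coefficients(x, groups):
    """Return Wilks' lambda plus standardized and structure coefficients
    (columns index the roots) for a descriptive discriminant analysis."""
    labels = np.unique(groups)
    n, p = x.shape
    df_within = n - len(labels)
    grand_mean = x.mean(axis=0)
    w = np.zeros((p, p))  # pooled within-groups SSCP matrix
    b = np.zeros((p, p))  # between-groups SSCP matrix
    for label in labels:
        xg = x[groups == label]
        dev = xg - xg.mean(axis=0)
        w += dev.T @ dev
        diff = (xg.mean(axis=0) - grand_mean)[:, None]
        b += len(xg) * (diff @ diff.T)
    # Whiten by W so that the eigenproblem for W^-1 B becomes symmetric.
    vals, vecs = np.linalg.eigh(w)
    w_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    eigvals, eigvecs = np.linalg.eigh(w_inv_sqrt @ b @ w_inv_sqrt)
    order = np.argsort(eigvals)[::-1][: min(p, len(labels) - 1)]
    # Raw coefficients scaled to unit pooled within-groups score variance.
    raw = (w_inv_sqrt @ eigvecs[:, order]) * np.sqrt(df_within)
    pooled_sd = np.sqrt(np.diag(w) / df_within)
    beta = raw * pooled_sd[:, None]                   # standardized
    r_s = (w / df_within) @ raw / pooled_sd[:, None]  # structure
    wilks = float(np.prod(1.0 / (1.0 + eigvals[order])))
    return wilks, beta, r_s


# Hypothetical data: 150 cases, 4 discriminating variables, 3 groups.
rng = np.random.default_rng(0)
groups = rng.integers(0, 3, size=150)
x = rng.normal(size=(150, 4))
x[:, 0] += groups  # only the first variable separates the groups
wilks, beta, r_s = discriminant_coefficients(x, groups)
print(round(wilks, 2))
```

With g groups and p discriminating variables the sketch returns min(p, g - 1) roots, which is consistent with the four roots of the first DA run (five groups, five variables) and the single root of the second (two groups, two variables).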

A DA run was performed utilizing the second set of variables in the population. Selected 
statistics that pertained to this run were as follows: 

1. Λ = .50, χ²(2) = 640.14, p = .00. 

2. β values: OPR = 0.99; and OB = 0.07. 

3. r_s values: OPR = 1.00; and OB = .18. 

SDs were computed for the 100 β and r_s values obtained in the DA and CA runs for each 
of the following: 

1. Each variable in the X set relative to each of the two DA sets of runs. 

2. Each variable in both the X and Y sets relative to each of the two CA sets of runs. 

The SDs of β and r_s for each variable were compared. While similar and other 
approaches were taken in previous studies, the writers acknowledge some limitations in 
comparing the SDs of the β and r_s distributions, since the r_s values can only range from 
-1.00 to 1.00 while the absolute values for β may exceed 1.00. Ranges and interquartile ranges 
were also computed and compared. 
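
The summary step described above can be sketched as follows: for one variable and one root, the coefficient values obtained across the samplings are reduced to an SD, a range, and an interquartile range. The sketch uses the default quartile method of the Python standard library, which may differ slightly from the method used by SPSS; the function name and the example values are hypothetical.

```python
import statistics


def stability_summary(values):
    """SD, range, and interquartile range of one coefficient across samplings."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "sd": statistics.stdev(values),
        "range": (min(values), max(values)),
        "iqr": (q1, q3),
    }


# Hypothetical beta and r_s values for one variable across seven samplings.
beta_values = [0.62, 0.48, 0.71, -0.10, 0.55, 0.93, 0.60]
rs_values = [0.70, 0.61, 0.76, 0.12, 0.66, 0.88, 0.69]
beta_summary = stability_summary(beta_values)
rs_summary = stability_summary(rs_values)
print(beta_summary["sd"] > rs_summary["sd"])
```

A larger SD across samplings is read as lower stability, which is the comparison reported for each pair of β and r_s values in Tables 5 through 8.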

SPSS 10.0 for Windows was used for the sampling and computations. 

Results 

DA Runs 

Table 5 contains the results for the first set of DA runs. Relative to Solution 1, the SD for 
β was greater than the SD for r_s for three of the five paired values. For Solution 2, the SDs 
for all the β values were greater than the SDs for the corresponding r_s values. The ranges and 
interquartile ranges somewhat cross validated the SD results. Furthermore, in nine of ten cases 
the SD concerning Solution 2 was greater than the corresponding SD for Solution 1. 

Table 6 contains the results for the second set of DA runs. While low, the SDs for both 
β values were greater than the SDs for the corresponding r_s values. The ranges and 
interquartile ranges generally cross validated the SD results. 

CA Runs 

Tables 7 and 8 contain the results for both sets of CA runs. Relative to the first set of 
CA runs (Table 7), for Root 1 the SD for β was lower than the corresponding SD for r_s for all 
five variables. For Root 2, the SD for β was lower than the corresponding SD for r_s for two of 
the five variables. The ranges and interquartile ranges somewhat cross validated the SD results. 
Furthermore, in seven of ten cases the SD concerning Solution 2 was less than the corresponding 
SD for Solution 1. 

Relative to the second set of CA runs (Table 8), for Root 1 the SD for β was lower than the 
corresponding SD for r_s for all five variables. For Root 2, the SD for β was lower than the 
corresponding SD for r_s for one of the five variables. The ranges and interquartile ranges 
somewhat cross validated the SD results. Furthermore, in six of ten cases the SD concerning 
Solution 2 was less than the corresponding SD for Solution 1. 

Table 5 

Standard Deviation and Other Statistics Relative to Standardized and Structure Coefficients 
Concerning Variable Set 1 for DA Runs 

                        β                                   r_s
Variable   Range               SD    Skewness    Range              SD    Skewness

Solution 1
NC         -0.51 to 0.50       .21     0.25      -.54 to .63        .19     1.35
           (-0.21 to 0.09)                       (-.32 to -.12)
NS         -0.53 to 0.62       .24     1.20      -.63 to .59        .22     1.77
           (-0.26 to -0.01)                      (-.34 to -.18)
A          -0.78 to 0.79       .28     1.16      -.55 to .50        .22     1.64
           (-0.44 to -0.09)                      (-.38 to -.14)
HS         -0.63 to 0.96       .28    -2.38      -.74 to .98        .31    -3.03
           (0.48 to 0.69)                        (.56 to .76)
OPR        -0.73 to 0.92       .32    -2.55      -.69 to .96        .32    -2.42
           (0.50 to 0.71)                        (.49 to .69)

Solution 2
NC         -0.97 to 0.97       .50    -0.17      -.59 to .91        .39    -0.31
NS         -0.64 to 0.83       .30    -0.29      -.54 to .75        .27    -0.09
A          -0.80 to 1.05       .45    -1.18      -.47 to .98        .35    -1.09
HS         -0.64 to 0.91       .35     0.14      -.56 to .66        .30    -0.03
OPR        -0.83 to 0.94       .37    -0.83      -.72 to .96        .34    -0.96

Note. Due to missing data, the sample size in each of the 100 samplings was not always 100. 
The mean N was 74.56, SD = 6.20. The mean Λ in the 100 samplings was .36, SD = .07. The 
values within parentheses below the ranges for the first solution are the interquartile ranges. 

Table 6 

Standard Deviation and Other Statistics Relative to Standardized and Structure Coefficients 
Concerning Variable Set 2 for DA Runs 

                        β                                   r_s
Variable   Range               SD    Skewness    Range              SD    Skewness

OPR        0.92 to 1.05        .02    -0.23      .92 to 1.00        .01    -2.54
           (0.98 to 1.00)                        (.99 to 1.00)
OB         -0.34 to 0.40       .14    -0.24      -.04 to .41        .10     0.06
           (-0.04 to 0.16)                       (.11 to .25)

Note. Due to missing data, the sample size in each of the 100 samplings was not always 150. 
The mean N was 91.36, SD = 5.66. The mean Λ in the samplings was .50, SD = .06. The values 
within parentheses below the ranges are the interquartile ranges. 

Table 7 

Standard Deviation and Other Statistics Relative to Standardized and Structure Coefficients 
Concerning Variable Set 1 for CA Runs 

                        β                                   r_s
Variable   Range               SD    Skewness    Range              SD    Skewness

Solution 1
HA         -1.01 to 1.02       .66    -0.34      -.96 to 1.00       .67    -0.39
           (-0.36 to 0.86)                       (-.34 to .88)
NC         -0.86 to 0.86       .45     0.22      -.95 to .91        .53     0.20
           (-0.56 to 0.17)                       (-.65 to .29)
NS         -0.96 to 1.02       .49     0.31      -.98 to .95        .57     0.21
           (-0.51 to 0.26)                       (-.62 to .36)
L          -1.08 to 1.10       .65    -0.36      -.99 to 1.00       .70    -0.28
           (-0.42 to 0.85)                       (-.54 to .89)
E          -1.08 to 1.08       .71    -0.27      -1.00 to 1.00      .75    -0.15
           (-0.46 to 0.83)                       (-.64 to .87)

Solution 2
HA         -0.88 to 1.05       .54    -0.86      -.87 to .99        .52    -0.73
NC         -0.93 to 1.00       .51     0.30      -.88 to .95        .52     0.29
NS         -1.07 to 1.05       .55     0.16      -.98 to 1.00       .56     0.24
L          -1.11 to 1.08       .64    -1.11      -.96 to 1.00       .56    -0.97
E          -1.09 to 1.11       .73    -0.32      -1.00 to 1.00      .66    -0.13

Note. HA, NC, and NS constituted one variable set, and L and E constituted the second variable 
set. Due to missing data, the sample size in each of the 100 samplings was not always 100. The 
mean N was 63.15, SD = 4.02. The mean Λ in the 100 samplings was .67, SD = .09. The values 
within parentheses below the ranges for the first solution are the interquartile ranges. 

Table 8 

Standard Deviation and Other Statistics Relative to Standardized and Structure Coefficients 
Concerning Variable Set 2 for CA Runs 

                        β                                   r_s
Variable   Range               SD    Skewness    Range              SD    Skewness

Solution 1
OCCA       -0.96 to 1.02       .46     0.70      -.99 to 1.00       .54     0.70
           (-0.47 to 0.07)                       (-.65 to .21)
NS         -0.41 to 0.61       .21     0.38      -.68 to .72        .38     0.38
           (-0.16 to 0.17)                       (-.39 to .26)
M          -1.01 to 1.08       .77    -0.55      -1.00 to .99       .84    -0.54
           (-0.80 to 0.83)                       (-.81 to .94)
OPR        -0.99 to 0.97       .46    -0.64      -1.00 to .99       .53    -0.73
           (-0.13 to 0.47)                       (-.11 to .62)
PA         -1.01 to 1.04       .82    -0.54      -1.00 to 1.00      .86    -0.58
           (-0.86 to 0.92)                       (-.85 to .97)

Solution 2
OCCA       -1.08 to 1.14       .65    -1.83      -.98 to 1.00       .57    -1.57
NS         -0.45 to 0.43       .20     0.36      -.58 to .52        .24     0.13
M          -1.02 to 1.09       .52    -0.82      -.81 to .98        .42    -0.41
OPR        -1.19 to 1.04       .66     1.79      -1.00 to .98       .60     1.63
PA         -0.89 to 1.14       .52    -0.57      -.75 to .96        .43    -0.28

Note. OCCA, NS, and M constituted one variable set, and OPR and PA constituted the second 
variable set. Due to missing data, the sample size in each of the 100 samplings was not always 
100. The mean N was 59.63, SD = 4.23. The mean Λ in the 100 samplings was .35, SD = .08. 
The values within parentheses below the ranges for the first solution are the interquartile ranges. 

Discussion 

No hypothesis was clearly supported by the results. For the DA runs, the stability of β 
was most often found to be less than the stability of the corresponding r_s. However, the reverse 
was true for the CA runs. The results for the CA runs somewhat contradict the results from the 
first writer's previous studies and what the literature usually suggests. 

For the DA runs, the stabilities of the standardized and structure coefficients for the 
first solution were most often greater than the corresponding stabilities for the second solution. 
However, the reverse was generally true for the CA runs. The results for the CA runs somewhat 
contradict the results from the first writer's previous studies and what "common sense" would 
suggest. 

The results for the least complex situation studied (DA with two discriminating variables) 
suggest greater stability of β and r_s than for more complex situations. Further study of this 
observation is warranted. 

With the exception of the results from the second set of DA runs, the results (as was 
true with the first writer's previous studies) suggest alarmingly low stabilities for both β and 
r_s. This should not suggest that the stabilities of β and r_s would typically be greater in DA 
than in CA, although this generality may be true. A plausible explanation regarding the better 
performance of β and r_s in the second set of DA runs is that, of the four sets studied, this 
second DA set contained the lowest number of variables. That the sample sizes were larger for 
this set is thought by the writers to be of small effect. 

The continuing evidence gathered regarding the generally low stability of β adds to its 
unattractiveness in CA and DA. However, the writers rebut some of the previous criticisms of 
β in that (a) it has interpretive value evidenced, in part, by its somewhat frequent use, (b) its 
interpretation is different from the interpretation of r_s, and (c) it does not appear to be even 
close to universally less stable than r_s. Furthermore, in many situations β provides more useful 
information than does r_s. 

Standardized coefficients already have a "bad" reputation. The writers, however, suggest 
not going so far as to avoid their use but to utilize caution in their interpretation. Furthermore, if 
use of β is to be avoided, similar concern would also apply to r_s. Additional study of the 
difference in stability of the coefficients is warranted, especially study of their apparently often 
alarmingly low performance. Some alternatives to their standard use exist (jackknife and 
bootstrap procedures, for example), but the alternatives have their own sets of limitations. 

The writers have no explanation at this point concerning the results that pertain to the 
difference between the Solution 1 and Solution 2 stabilities of both statistics. When poorer 
stability performance of both statistics relative to Root 2 was found in the first writer's previous 
studies, the writer explained that the results were expected because roots beyond the first root 
contain more "error" as reflected by the declining R_c^2 with each subsequent root. The first 
writer's numerous observations across several years suggest that this generality is most true when 
the number of roots is low. More study of this issue is also warranted. 

The somewhat unexpected results may be in part due to cases excluded from analysis due 
to missing data. The loss of the cases was assumed to be random — which is consistent with 
most multivariate data-analysis practices. However, the skewness and other statistics for the 
cases actually utilized in each CA and DA run likely differed from the statistics reported for all 
the original cases. More attention to this issue in future studies is warranted. 

The writers also considered that the unexpected results may be attributed to unique 
samples that resulted from original variables whose SDs were "too low." This prompted the 
writers' inspection of the SDs of the original variables; the lowest was for L, SD = 0.57. 
The writers concluded that the SDs for all the original variables were sufficient. 

While the search to find the best method to study the stability of the statistics that pertain 
to this study is likely to be accompanied by frustration, researchers must have an open mind to 
the advantages and disadvantages of the several approaches — in terms of their own work as well 
as their critiques of the works of others. The writers are still not satisfied with utilizing SDs as 
the primary criterion but have as yet found no clearly more attractive alternative. 